IV Pizza Pub, a favorite of many UCSB students and IV locals, serves customers quickly by making a variety of pizzas in advance and letting patrons select slices from the premade pizzas on display. Sometimes, though, a patron's pizza of choice is not on display. The patron must then either endure a long wait while their pizza is made or settle for an alternative pizza, leaving them less satisfied. In the worst case, the patron may leave altogether. The goal of our project is to train a YOLOv8 model to identify the types of pizza on display, using footage from IV Pizza Pub's security cameras.
We were able to access security footage from IV Pizza Pub. Then, using Roboflow, a computer-vision annotation platform, we labelled over 100 images taken from this footage. Within this dataset, there are 12 classes (pizza types):
- Big Sur
- Cheese
- Chicken Bacon
- Manresa
- Maui Wowie
- Pepperoni
- Pepperoni Sausage
- Pesto
- Rockaway
- Special Slice
- Sweet Heat
- Veggie
PART 1: Data Labelling and Augmentation
We hand labelled ~100 images using Roboflow. We also augmented this data, both to increase the total number of images and to make our model more robust. We split our dataset into an 80/10/10 train/val/test split. Because we do not have an excess of images to work with, we want to use as many as possible for training, so the 80/10/10 split is appropriate.
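Roboflow performed the actual split for us, but the idea is simple: shuffle the image list and slice it at the 80% and 90% marks. A minimal sketch (the file names below are invented for illustration):

```python
import random

def train_val_test_split(items, seed=0):
    """Shuffle items and split them 80/10/10 into train/val/test."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    train_end = int(n * 0.8)
    val_end = int(n * 0.9)
    return items[:train_end], items[train_end:val_end], items[val_end:]

# Hypothetical file names, for illustration only.
images = [f"frame_{i:03d}.jpg" for i in range(100)]
train, val, test = train_val_test_split(images)
print(len(train), len(val), len(test))  # 80 10 10
```

Shuffling with a fixed seed keeps the split reproducible, and every image lands in exactly one of the three sets.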
We strategically augmented our data to best serve the real usage of this model. To account for lighting changes throughout the day, we varied the saturation by up to 25%. To account for pizzas sitting in various positions on the cutting board, we rotated images by up to 15°. To account for slight variations in camera quality, we added Gaussian noise to up to 0.5% of pixels. Ultimately, we ended up with 367 total images. Because our rarer pizza types only occur in images alongside popular types, we were unable to 'balance' our dataset to represent all classes equally. This may affect our model's performance.
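Roboflow applied these augmentations for us, but the same three transforms can be approximated in a few lines with Pillow and NumPy. This is a rough sketch whose parameter ranges mirror our Roboflow settings, not Roboflow's actual implementation:

```python
import numpy as np
from PIL import Image, ImageEnhance

def augment(img, rng):
    """Approximate our Roboflow augmentations: saturation jitter
    (up to 25%), rotation (up to 15 degrees), and Gaussian noise
    on up to 0.5% of pixels."""
    # Saturation: scale by a random factor in [0.75, 1.25].
    img = ImageEnhance.Color(img).enhance(rng.uniform(0.75, 1.25))
    # Rotation: random angle in [-15, 15] degrees.
    img = img.rotate(rng.uniform(-15, 15))
    # Noise: perturb a random ~0.5% subset of pixels.
    arr = np.asarray(img).astype(np.int16)
    mask = rng.random(arr.shape[:2]) < 0.005
    noise = rng.normal(0, 25, arr.shape).astype(np.int16)
    arr[mask] += noise[mask]
    return Image.fromarray(arr.clip(0, 255).astype(np.uint8))

rng = np.random.default_rng(0)
augmented = augment(Image.new("RGB", (64, 64), (200, 120, 80)), rng)
```

Each augmented copy counts as a new training image, which is how ~100 labelled images became 367.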
PART 2: Importing our Data and Training our Model
First, we will set our directory and import the necessary libraries.
# Set our working directory
import os
HOME = os.getcwd()
# Install necessary libraries
!pip install roboflow
!pip install ultralytics==8.2.103 -q
Collecting roboflow
  Downloading roboflow-1.1.58-py3-none-any.whl (84 kB)
Collecting idna==3.7 (from roboflow)
Collecting opencv-python-headless==4.10.0.84 (from roboflow)
Collecting pillow-heif>=0.18.0 (from roboflow)
Collecting python-dotenv (from roboflow)
Collecting filetype (from roboflow)
Successfully installed filetype-1.2.0 idna-3.7 opencv-python-headless-4.10.0.84 pillow-heif-0.22.0 python-dotenv-1.0.1 roboflow-1.1.58
# Import necessary packages
from IPython import display
display.clear_output()
import ultralytics
ultralytics.checks()
from ultralytics import YOLO
from IPython.display import display, Image
Ultralytics YOLOv8.2.103 🚀 Python-3.11.11 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
Setup complete ✅ (2 CPUs, 12.7 GB RAM, 41.1/112.6 GB disk)
Now, we will train a YOLOv8 model on our annotated data. We chose YOLOv8 because it is the highest-performing free model that Roboflow is compatible with. We will train for up to 100 epochs, a standard training length for YOLOv8. However, we will set patience to 15, so that training stops if the model does not improve for 15 consecutive epochs. This technique is called Early Stopping.
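The Early Stopping rule itself is simple to state: track the best validation score seen so far, and stop once `patience` epochs pass without a new best. A minimal sketch of the logic (not Ultralytics' actual implementation):

```python
def early_stop_epochs(val_scores, patience=15):
    """Return the number of epochs actually run: training halts once
    `patience` epochs pass with no new best validation score."""
    best, best_epoch = float("-inf"), 0
    for epoch, score in enumerate(val_scores, start=1):
        if score > best:
            best, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            return epoch  # stopped early here
    return len(val_scores)

# Toy example: scores improve until epoch 4, then plateau,
# so training stops at epoch 4 + 15 = 19.
scores = [0.1, 0.3, 0.4, 0.5] + [0.45] * 20
print(early_stop_epochs(scores, patience=15))  # 19
```

In our actual run below, the best model appears at epoch 36 and training stops at epoch 51, exactly 15 epochs later.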
# Import our dataset using Roboflow's API
from roboflow import Roboflow
rf = Roboflow(api_key="----")
project = rf.workspace("pizza-pub-project-hdqsn").project("my-first-project-dwhuu")
version = project.version(11)
dataset = version.download("yolov8")
loading Roboflow workspace... loading Roboflow project...
Downloading Dataset Version Zip in My-First-Project-11 to yolov8:: 100%|██████████| 17352/17352 [00:02<00:00, 7090.11it/s]
Extracting Dataset Version Zip to My-First-Project-11 in yolov8:: 100%|██████████| 746/746 [00:00<00:00, 7115.99it/s]
# Train our model
%cd {HOME}
epochs = 100
!yolo task=detect mode=train model=yolov8s.pt data={dataset.location}/data.yaml epochs={epochs} plots=True patience=15
/content
Downloading https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s.pt to 'yolov8s.pt'... 100% 21.5M/21.5M
Ultralytics YOLOv8.2.103 🚀 Python-3.11.11 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
engine/trainer: task=detect, mode=train, model=yolov8s.pt, data=/content/My-First-Project-11/data.yaml, epochs=100, patience=15, batch=16, imgsz=640, plots=True, save_dir=runs/detect/train
Overriding model.yaml nc=80 with nc=12
Model summary: 225 layers, 11,140,244 parameters, 11,140,228 gradients, 28.7 GFLOPs
Transferred 349/355 items from pretrained weights
AMP: checks passed ✅
train: Scanning /content/My-First-Project-11/train/labels... 339 images, 6 backgrounds, 0 corrupt
WARNING ⚠️ Box and segment counts should be equal, but got len(segments) = 1752, len(boxes) = 1812. Only boxes will be used and all segments will be removed.
val: Scanning /content/My-First-Project-11/valid/labels... 14 images, 0 backgrounds, 0 corrupt
optimizer: AdamW(lr=0.000625, momentum=0.9)
Starting training for 100 epochs...

      Epoch   box_loss   cls_loss   dfl_loss        P        R    mAP50  mAP50-95
      1/100      2.157      5.037      1.949    0.129    0.297    0.065    0.0317
        ...        ...        ...        ...      ...      ...      ...       ...
     36/100     0.9182     0.9265      1.082    0.489    0.543    0.587     0.388
        ...        ...        ...        ...      ...      ...      ...       ...
     51/100     0.7782     0.7268      1.022    0.590    0.484    0.514     0.314

EarlyStopping: Training stopped early as no improvement observed in last 15 epochs. Best results observed at epoch 36, best model saved as best.pt.
51 epochs completed in 0.112 hours.
Validating runs/detect/train/weights/best.pt...
Ultralytics YOLOv8.2.103 🚀 Python-3.11.11 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
Model summary (fused): 168 layers, 11,130,228 parameters, 0 gradients, 28.5 GFLOPs

                Class  Images  Instances      P      R  mAP50  mAP50-95
                  all      14         80  0.491  0.551  0.592     0.389
              Big Sur       4          5  0.517  0.400  0.477     0.220
               Cheese      11         12  0.545  0.583  0.572     0.383
        Chicken Bacon      10         11  0.443  0.727  0.524     0.214
              Manresa       4          4  0.279  0.500  0.572     0.417
           Maui Wowie       8          9  0.538  0.776  0.709     0.450
            Pepperoni      11         11  0.652  0.818  0.768     0.545
    Pepperoni Sausage       5          6  0.427  1.000  0.776     0.490
                Pesto       9         10  0.780  0.800  0.801     0.634
             Rockaway       5          6  0.259  0.176  0.267     0.112
        Special Slice       1          1  1.000  0.000  0.995     0.796
           Sweet Heat       2          2  0.229  0.500  0.236     0.129
               Veggie       3          3  0.230  0.333  0.408     0.281

Speed: 0.2ms preprocess, 6.0ms inference, 0.0ms loss, 2.5ms postprocess per image
Results saved to runs/detect/train
The graphs below are helpful visualizations, displaying how our model improves (our loss drops) throughout training. Our best epoch was epoch 36, where precision is 0.489 and recall is 0.543 on the validation data. We will test our model on unseen data (our test set) later in the notebook.
%cd {HOME}
Image(filename=f'{HOME}/runs/detect/train/results.png', width=600)
/content
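The curves in results.png are plotted from runs/detect/train/results.csv, so the best epoch can also be pulled out programmatically. A sketch using a toy stand-in for the CSV (the real file has more columns; `metrics/mAP50-95(B)` is the header Ultralytics uses, sometimes padded with spaces):

```python
import io
import pandas as pd

# Toy stand-in for runs/detect/train/results.csv, using three
# epochs' metrics from our actual run.
csv_text = """epoch,metrics/precision(B),metrics/recall(B),metrics/mAP50-95(B)
1,0.129,0.297,0.0317
36,0.489,0.543,0.388
51,0.590,0.484,0.314
"""
df = pd.read_csv(io.StringIO(csv_text))
df.columns = df.columns.str.strip()  # some versions pad headers with spaces
best = df.loc[df["metrics/mAP50-95(B)"].idxmax()]
print(int(best["epoch"]))  # 36
```

On the full results.csv this picks out epoch 36, matching the EarlyStopping message in the training log.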
PART 3: Evaluating our Model
We can now evaluate our model's performance on the validation and test sets. The confusion matrix below shows the relationship between predicted and true classes in our validation set. As we can see, Cheese and Pepperoni are often accurately predicted. However, the model predicts many slices to be background rather than pizza, meaning many pizza slices go unidentified. This may be due to a lack of training data, or to the fact that blurry images of pizza resemble the background, due to their similar colors.
%cd {HOME}
Image(filename=f'{HOME}/runs/detect/train/confusion_matrix.png', width=600)
/content
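The "predicted as background" row of the matrix is exactly the missed detections, and per-class recall falls out of the raw counts directly. A toy sketch with invented counts (not our actual matrix), laid out as in the plot above with predictions on the rows and true classes on the columns:

```python
import numpy as np

# Rows = predicted class, columns = true class; the last row is
# "background" (slices the model failed to detect at all).
# Counts are invented for illustration.
cm = np.array([
    [8, 1, 0],   # predicted Cheese
    [1, 6, 0],   # predicted Pepperoni
    [3, 2, 0],   # predicted background (missed slices)
])
# Recall per true class: correct predictions over column totals.
recall = np.diag(cm)[:2] / cm.sum(axis=0)[:2]
print(recall.round(3))
```

A heavy background row drags recall down for every class even when the off-diagonal confusion between pizza types is small, which is the pattern we see in our matrix.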
Next, we can see how our model performs on unseen data. For this, we will use the test set. For our best model, precision is 0.68 and recall is 0.508 on the test set, which is reasonably good given our small dataset.
%cd {HOME}
!yolo task=detect mode=val model={HOME}/runs/detect/train/weights/best.pt data={dataset.location}/data.yaml split=test
/content
Ultralytics YOLOv8.2.103 🚀 Python-3.11.11 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
Model summary (fused): 168 layers, 11,130,228 parameters, 0 gradients, 28.5 GFLOPs
val: Scanning /content/My-First-Project-11/test/labels... 14 images, 0 backgrounds, 0 corrupt
WARNING ⚠️ Box and segment counts should be equal, but got len(segments) = 74, len(boxes) = 79. Only boxes will be used and all segments will be removed.

                Class  Images  Instances      P      R  mAP50  mAP50-95
                  all      14         79  0.680  0.508  0.602     0.376
              Big Sur       2          3  0.884  0.333  0.494     0.284
               Cheese       9         10  0.464  0.300  0.465     0.313
        Chicken Bacon       8          8  0.423  0.750  0.580     0.396
              Manresa       5          5  0.666  0.600  0.720     0.476
           Maui Wowie      11         13  1.000  0.603  0.745     0.434
            Pepperoni      11         11  0.862  0.909  0.951     0.723
    Pepperoni Sausage       6          8  0.730  0.346  0.495     0.316
                Pesto       8          8  0.709  0.500  0.579     0.397
             Rockaway       4          4  0.859  0.750  0.753     0.541
        Special Slice       1          2  1.000  0.000  0.106    0.0528
           Sweet Heat       2          2  0.567  1.000  0.995     0.400
               Veggie       5          5  0.000  0.000  0.339     0.179

Speed: 0.4ms preprocess, 12.4ms inference, 0.0ms loss, 11.6ms postprocess per image
Results saved to runs/detect/val
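Precision and recall trade off against each other, so a single summary number can be handy; their harmonic mean is the F1 score (which the validation output does not report directly). For the test-set figures above:

```python
def f1(precision, recall):
    """F1 score: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Overall test-set precision and recall from the validation run above.
print(round(f1(0.68, 0.508), 3))  # 0.582
```

The harmonic mean punishes imbalance, so our middling recall pulls F1 well below the 0.68 precision.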
Below, we can examine the bounding boxes produced by our model. We will run inference on the test set and check out some of the model's predictions. Feel free to change the number in image_paths to view various images!
# run inference on the test set, save images
%cd {HOME}
!yolo task=detect mode=predict model={HOME}/runs/detect/train/weights/best.pt conf=0.25 source={dataset.location}/test/images save=True
/content
Ultralytics YOLOv8.2.103 🚀 Python-3.11.11 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
Model summary (fused): 168 layers, 11,130,228 parameters, 0 gradients, 28.5 GFLOPs
image 1/14 /content/My-First-Project-11/test/images/Screenshot-2025-03-09-145513_png.rf.ac3621a89632fdd939f865860e039b98.jpg: 640x640 1 Chicken Bacon, 1 Pepperoni, 1 Sweet Heat, 16.3ms
image 2/14 /content/My-First-Project-11/test/images/Screenshot-2025-03-09-145933_png.rf.ddbf7b7dda759b5f09718cc2515c46e6.jpg: 640x640 2 Chicken Bacons, 1 Maui Wowie, 1 Pepperoni, 1 Rockaway, 1 Sweet Heat, 16.3ms
image 3/14 /content/My-First-Project-11/test/images/Screenshot-2025-03-09-145942_png.rf.3407befbf8b18ad58d48e1422c63db12.jpg: 640x640 1 Cheese, 1 Chicken Bacon, 1 Manresa, 1 Pepperoni, 16.2ms
image 4/14 /content/My-First-Project-11/test/images/Screenshot-2025-03-09-145957_png.rf.195ef973c66870debc99a95817ac692d.jpg: 640x640 1 Pepperoni, 2 Pestos, 16.2ms
image 5/14 /content/My-First-Project-11/test/images/Screenshot-2025-03-09-150059_png.rf.c2d70af174d1bd37b472ce5dfbd97abd.jpg: 640x640 1 Chicken Bacon, 1 Maui Wowie, 2 Pepperonis, 1 Pesto, 16.3ms
image 6/14 /content/My-First-Project-11/test/images/Screenshot-2025-03-09-150217_png.rf.163b6620bf467cbeac20d455211e16b1.jpg: 640x640 1 Chicken Bacon, 1 Maui Wowie, 16.2ms
image 7/14 /content/My-First-Project-11/test/images/Screenshot-2025-03-09-150232_png.rf.549363b55ceabe5aa064ec50ca70f9f7.jpg: 640x640 1 Cheese, 1 Chicken Bacon, 1 Maui Wowie, 1 Pepperoni, 1 Pepperoni Sausage, 1 Pesto, 1 Rockaway, 16.2ms
image 8/14 /content/My-First-Project-11/test/images/Screenshot-2025-03-09-150240_png.rf.df6d054ccf0a4ae38b6d2e2defef9dba.jpg: 640x640 2 Chicken Bacons, 1 Manresa, 1 Maui Wowie, 2 Pepperonis, 1 Pepperoni Sausage, 13.6ms
image 9/14 /content/My-First-Project-11/test/images/Screenshot-2025-03-09-150345_png.rf.fceab735142a5c39335f4990960dd435.jpg: 640x640 (no detections), 13.2ms
image 10/14 /content/My-First-Project-11/test/images/Screenshot-2025-03-09-150641_png.rf.1440310b3111a8365f56da42128fd577.jpg: 640x640 1 Big Sur, 3 Chicken Bacons, 1 Manresa, 1 Pepperoni, 13.2ms
image 11/14 /content/My-First-Project-11/test/images/Screenshot-2025-03-09-151120_png.rf.3b008d5131c656a1c22074ee14b5765b.jpg: 640x640 1 Cheese, 1 Manresa, 1 Pesto, 1 Sweet Heat, 13.2ms
image 12/14 /content/My-First-Project-11/test/images/Screenshot-2025-03-09-151541_png.rf.97758f6d0a36a578f3f3706c0d150218.jpg: 640x640 1 Cheese, 1 Chicken Bacon, 1 Maui Wowie, 1 Sweet Heat, 1 Veggie, 13.2ms
image 13/14 /content/My-First-Project-11/test/images/Screenshot-2025-03-09-151702_png.rf.a99db49e7036f945860c17f46964ad10.jpg: 640x640 1 Big Sur, 1 Chicken Bacon, 1 Manresa, 1 Pepperoni Sausage, 13.2ms
image 14/14 /content/My-First-Project-11/test/images/Screenshot-2025-03-09-152055_png.rf.1acdc3409f2a9fcf78022c421e8f2979.jpg: 640x640 1 Cheese, 1 Chicken Bacon, 2 Pepperonis, 1 Pepperoni Sausage, 13.2ms
Speed: 1.9ms preprocess, 14.8ms inference, 13.3ms postprocess per image at shape (1, 3, 640, 640)
Results saved to runs/detect/predict
💡 Learn more at https://docs.ultralytics.com/modes/predict
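The end use of these predictions is telling patrons which pizzas are on display. A minimal sketch of that logic, using a hypothetical list of detected class names (in practice this list would be extracted from the model's prediction results):

```python
# Given the class names detected in the latest frame, report which pizzas are
# on display and which menu items are missing. The 12 classes match our dataset.
MENU = {
    "Big Sur", "Cheese", "Chicken Bacon", "Manresa", "Maui Wowie",
    "Pepperoni", "Pepperoni Sausage", "Pesto", "Rockaway",
    "Special Slice", "Sweet Heat", "Veggie",
}

def display_status(detected_classes):
    """Return sorted (available, missing) pizza types from detected class names."""
    available = MENU & set(detected_classes)
    return sorted(available), sorted(MENU - available)

# Hypothetical detections, matching test image 1 above
available, missing = display_status(["Chicken Bacon", "Pepperoni", "Sweet Heat"])
print("On display:", available)
print("Not on display:", missing)
```

A website serving this information would simply refresh `display_status` on each new frame from the camera.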
import os
import glob
from IPython.display import Image, display

# Define the base path where the prediction folders are located
base_path = '/content/runs/detect/'

# List all directories that start with 'predict' in the base path
subfolders = [os.path.join(base_path, d) for d in os.listdir(base_path)
              if os.path.isdir(os.path.join(base_path, d)) and d.startswith('predict')]

# Find the latest folder by modification time
latest_folder = max(subfolders, key=os.path.getmtime)
image_paths = glob.glob(f'{latest_folder}/*.jpg')[:3]  # <- CHANGE ME TO SEE OTHER IMAGES

# Display each annotated image
for image_path in image_paths:
    display(Image(filename=image_path, width=600))
    print("\n")
PART 4: Conclusion and Looking Forward
Overall, our IV Pizza Pub Pizza Identifier demonstrates a use case of computer vision for business purposes. Going forward, our goal would be to serve this model and connect it to the security cameras at IV Pizza Pub. Then, we could host a website where patrons could view the available pizza options before visiting the store, and request their pizza of choice if it is not already made, improving customer satisfaction and reducing wait times.
As is, the model is not accurate enough for this business use to consistently identify the pizzas in an image. However, there are multiple improvements we could make to reach the necessary accuracy. First, if IV Pizza Pub were to invest in higher-quality cameras, the images would be less blurry and the model's accuracy would increase. Second, we could process more footage to drastically increase the amount of training data. In particular, more training examples of the less common pizza types would help 'balance' the dataset and make the model more robust.
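To quantify how imbalanced the dataset currently is, we could count instances per class across the label files. This is a sketch assuming YOLO-format labels, where each line of a .txt file starts with an integer class id; the directory path is illustrative and should be adjusted to the dataset's actual location:

```python
import glob
from collections import Counter

def class_counts(label_dir):
    """Count annotated instances per class id across YOLO-format label files."""
    counts = Counter()
    for path in glob.glob(f"{label_dir}/*.txt"):
        with open(path) as f:
            for line in f:
                if line.strip():
                    # The first field on each line is the integer class id
                    counts[int(line.split()[0])] += 1
    return counts

counts = class_counts("/content/My-First-Project-11/train/labels")
for class_id, n in counts.most_common():
    print(f"class {class_id}: {n} instances")
```

Classes with very few instances (like Special Slice, which had only 2 test instances) are the ones that would benefit most from additional labeled footage.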
Thank you for checking out our IV Pizza Pub Pizza Identifier!